Link Prediction Based on Graph Topology: The Predictive Value of the Generalized Clustering Coefficient

Author

  • Zan Huang
Abstract

Predicting linkages among data objects is a fundamental data mining task in various application domains, including recommender systems, information retrieval, automatic Web hyperlink generation, record linkage, and communication surveillance. In many contexts link prediction is based entirely on the linkage information itself (a prominent example is collaborative filtering recommendation). Link-structure-based link prediction is closely related to a parallel and almost separate stream of research on topological modeling of large-scale graphs. Graph topological modeling builds on random graph theory to find parsimonious graph generation models that reproduce empirical topological measures summarizing the global structure of a graph, such as the clustering coefficient, average path length, and degree distribution. These well-studied topological measures and graph generation models have direct implications for link prediction. This paper represents initial efforts to explore the connection between link prediction and graph topology. The focus is exclusively on the predictive value of the clustering coefficient measure. The standard clustering coefficient measure is generalized to capture higher-order clustering tendencies. The proposed framework consists of a cycle formation link probability model, a procedure for estimating model parameters based on the generalized clustering coefficients, and model-based link prediction generation. Using the Enron email dataset, we demonstrate that the proposed cycle formation model corresponds closely to the actual link probabilities and that the link prediction algorithm based on this model outperforms existing algorithms.
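
The abstract does not spell out how the clustering coefficient is generalized, so the following is only a minimal sketch under one plausible reading: the order-k coefficient is taken to be the fraction of simple paths with k edges whose two endpoints are also directly linked, so that each such path closes into a cycle of length k + 1. With k = 2 this reduces to the standard global clustering coefficient (transitivity). The function names `build_adjacency` and `generalized_clustering` are hypothetical and are not taken from the paper.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def generalized_clustering(adj, k):
    """Fraction of simple k-edge paths whose endpoints are adjacent.

    Assumption (not confirmed by the abstract): this is what
    "higher-order clustering" measures. Brute-force enumeration,
    so it is only suitable for small graphs.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    closed = 0
    total = 0

    def extend(path, visited):
        nonlocal closed, total
        if len(path) == k + 1:  # k edges means k + 1 distinct nodes
            total += 1
            if path[0] in adj[path[-1]]:
                closed += 1  # the path closes into a (k + 1)-cycle
            return
        for nxt in adj[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    # Every path is found once from each end; the ratio is unaffected.
    for start in adj:
        extend([start], {start})
    return closed / total if total else 0.0


if __name__ == "__main__":
    # Toy graph: a triangle a-b-c with a pendant node d attached to c.
    toy = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
    print(generalized_clustering(toy, 2))  # standard clustering coefficient
    print(generalized_clustering(toy, 3))  # order-3 variant
```

On a 4-cycle, for example, the order-2 value is 0 (no triangles) while the order-3 value is 1, which illustrates the kind of higher-order tendency the standard coefficient cannot see.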

Similar resources

Link Prediction using Network Embedding based on Global Similarity

Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the history of previous link connections and combining it with the available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, the global link predicti...

Sampling from social networks’s graph based on topological properties and bee colony algorithm

In recent years, the sampling problem in massive social network graphs has attracted much attention, since a small, representative sample can be analyzed quickly instead of the whole network. Many algorithms have been proposed for sampling social network graphs. The purpose of these algorithms is to create a sample that is approximately similar to the original network’s graph in terms of properties such as de...

An Efficient Predictive Model for Probability of Genetic Diseases Transmission Using a Combined Model

In this article, a new combined approach of a decision tree and clustering is presented to predict the transmission of genetic diseases. In this article, the performance of these algorithms is compared for more accurate prediction of disease transmission under the same condition and based on a series of measures like the positive predictive value, negative predictive value, accuracy, sensitivit...

Providing a Link Prediction Model based on Structural and Homophily Similarity in Social Networks

In recent years, with the growing number of online social networks, these networks have become one of the best markets for advertising and commerce, so studying these networks is very important. Most online social networks are growing and changing with new communications (new edges). Forecasting new edges in online social networks can give us a better understanding of the growth of these networ...

Sensorless Model Predictive Force Control with a Novel Weight Coefficients for 3-Phase 4-Switch Inverter Fed Linear Induction Motor Drives

The sensorless model predictive force control (SMPFC) is a strong method for controlling the drives of three-phase 4(6)-switch inverter linear induction motors. This kind of inverter can be employed for fault tolerant control in order to solve the problem of open/short circuit in 6-switch inverters (B6). This paper proposed a method for the SMPFC of a linear induction motor (LIM) with a 4-switc...

Publication date: 2006